
[Frontend] Automatic detection of chat content format from AST #9919

Merged: 27 commits merged into main on Nov 16, 2024

Conversation

DarkLight1337 (Member) commented Nov 1, 2024

This PR renames --chat-template-text-format (introduced by #9358) to --chat-template-content-format and moves it to the CLI parser specific to the OpenAI-compatible server. It also removes the redundant hardcoded logic for Llama-3.2-Vision (last updated by #9393), since we can now run online inference with --chat-template-content-format openai.

To avoid causing incompatibilities with how users are currently serving Llama-3.2-Vision, I have added code to automatically detect the format to use based on the AST of the provided chat template.

cc @vrdn-23 @ywang96 @heheda12345 @alex-jw-brooks

FIX #10286

github-actions bot commented Nov 1, 2024

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add ready label to the PR
  • Enable auto-merge.

🚀

@mergify mergify bot added documentation Improvements or additions to documentation frontend labels Nov 1, 2024
vrdn-23 (Contributor) commented Nov 1, 2024

Great idea with the PR @DarkLight1337!
The problem with auto-detecting is that a lot of chat templates do not throw errors in Jinja even when the input does not fit the expected format, which is what made the bug in #9294 so subtle: the content string was simply not being looped over, so no content was added to the conversation. I'm not completely familiar with how Jinja works, so if you figure out a way to detect this, let me know and I can help out!
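
To illustrate that failure mode, here is a minimal sketch (the template and messages are made-up examples): a template that loops over message['content'] expecting OpenAI-style parts renders nothing when the content is a plain string, and Jinja raises no error.

import jinja2

# Made-up template that assumes OpenAI-style list content.
template = jinja2.Template(
    "{%- for message in messages -%}"
    "{%- for content in message['content'] -%}"
    "{%- if content['type'] == 'text' -%}{{ content['text'] }}{%- endif -%}"
    "{%- endfor -%}"
    "{%- endfor -%}"
)

# OpenAI-style content renders as expected.
print(template.render(messages=[
    {"role": "user", "content": [{"type": "text", "text": "Hello"}]},
]))  # -> Hello

# Plain-string content: the loop iterates over single characters, the 'type'
# lookup resolves to Undefined, and the output is silently empty.
print(template.render(messages=[
    {"role": "user", "content": "Hello"},
]))  # -> (empty string)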

DarkLight1337 (Member Author) commented Nov 2, 2024

Great idea with the PR @DarkLight1337!
The problem with auto-detecting is that a lot of chat templates do not throw errors in Jinja even when the input does not fit the expected format, which is what made the bug in #9294 so subtle: the content string was simply not being looped over, so no content was added to the conversation. I'm not completely familiar with how Jinja works, so if you figure out a way to detect this, let me know and I can help out!

Right now I am thinking of using Jinja's AST parser and working off that. The basic idea is to detect whether messages[int]['content'] is being treated as a string or a list of dictionaries.
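
For reference, the two shapes in question look roughly like this (a sketch with made-up values; the exact image placeholder depends on the model):

message_string_format = {
    # "string" format: multimodal content is flattened into one string,
    # with a placeholder token standing in for the image.
    "role": "user",
    "content": "<image>\nWhat is in this image?",
}

message_openai_format = {
    # "openai" format: content stays a list of typed parts, as in the OpenAI API.
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/duck.jpg"}},
        {"type": "text", "text": "What is in this image?"},
    ],
}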

Comment on lines 147 to 217
def _is_var_access(node: jinja2.nodes.Node, varname: str) -> bool:
    if isinstance(node, jinja2.nodes.Name):
        return node.ctx == "load" and node.name == varname

    return False


def _is_attr_access(node: jinja2.nodes.Node, varname: str, key: str) -> bool:
    if isinstance(node, jinja2.nodes.Getitem):
        return (node.ctx == "load" and _is_var_access(node.node, varname)
                and isinstance(node.arg, jinja2.nodes.Const)
                and node.arg.value == key)

    if isinstance(node, jinja2.nodes.Getattr):
        return (node.ctx == "load" and _is_var_access(node.node, varname)
                and node.attr == key)

    return False


def _iter_nodes_define_message(chat_template_ast: jinja2.nodes.Template):
    # Search for {%- for message in messages -%} loops
    for loop_ast in chat_template_ast.find_all(jinja2.nodes.For):
        loop_iter = loop_ast.iter
        loop_target = loop_ast.target

        if _is_var_access(loop_iter, "messages"):
            assert isinstance(loop_target, jinja2.nodes.Name)
            yield loop_ast, loop_target.name


def _iter_nodes_define_content_item(chat_template_ast: jinja2.nodes.Template):
    for node, message_varname in _iter_nodes_define_message(chat_template_ast):
        # Search for {%- for content in message['content'] -%} loops
        for loop_ast in node.find_all(jinja2.nodes.For):
            loop_iter = loop_ast.iter
            loop_target = loop_ast.target

            if _is_attr_access(loop_iter, message_varname, "content"):
                assert isinstance(loop_target, jinja2.nodes.Name)
                yield loop_iter, loop_target.name


def _detect_content_format(
    chat_template: str,
    *,
    default: _ChatTemplateContentFormat,
) -> _ChatTemplateContentFormat:
    try:
        jinja_compiled = hf_chat_utils._compile_jinja_template(chat_template)
        jinja_ast = jinja_compiled.environment.parse(chat_template)
    except Exception:
        logger.exception("Error when compiling Jinja template")
        return default

    try:
        next(_iter_nodes_define_content_item(jinja_ast))
    except StopIteration:
        return "string"
    else:
        return "openai"
DarkLight1337 (Member Author) commented Nov 2, 2024

This handles the most common case of iterating through OpenAI-formatted message['content'] as a list, assuming that no relevant variable reassignments are made other than those in the for loops.

Please tell me if you are aware of any chat templates that don't work with this code.
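
As a rough usage sketch (assuming the helpers above are in scope; the templates are made-up examples), the detection classifies a template by whether it contains a nested loop over message['content']:

import jinja2

string_style_template = (
    "{%- for message in messages -%}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{%- endfor -%}"
)

openai_style_template = (
    "{%- for message in messages -%}"
    "{%- for content in message['content'] -%}"
    "{%- if content['type'] == 'text' -%}{{ content['text'] }}{%- endif -%}"
    "{%- endfor -%}"
    "{%- endfor -%}"
)

env = jinja2.Environment()
for name, source in [("string-style", string_style_template),
                     ("openai-style", openai_style_template)]:
    template_ast = env.parse(source)
    # A nested loop over message['content'] marks the template as "openai".
    found = next(_iter_nodes_define_content_item(template_ast), None)
    print(name, "->", "openai" if found is not None else "string")
# string-style -> string
# openai-style -> openai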

@@ -380,10 +521,7 @@ def load_chat_template(

        # If opening the path as a file fails, assume the argument is the
        # template itself and decode it so that escape sequences are
        # interpreted correctly
        resolved_chat_template = codecs.decode(chat_template, "unicode_escape")

    logger.info("Using supplied chat template:\n%s", resolved_chat_template)
DarkLight1337 (Member Author)

This logging line has been moved to vllm/entrypoints/openai/api_server.py.
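
As a small sketch of why the unicode_escape decoding matters (made-up template string): a template passed directly as a CLI argument arrives with literal backslash escapes, and decoding turns them back into real characters.

import codecs

# As received from the command line: the "\n" here is a backslash followed by "n".
raw_arg = "{{ bos_token }}\\n{% for message in messages %}{{ message['content'] }}{% endfor %}"

resolved = codecs.decode(raw_arg, "unicode_escape")
print(resolved)  # the "\n" escape is now an actual newline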

Comment on lines +1083 to +1095
    chat_template: Optional[str] = Field(
        default=None,
        description=(
            "A Jinja template to use for this conversion. "
            "As of transformers v4.44, default chat template is no longer "
            "allowed, so you must provide a chat template if the tokenizer "
            "does not define one."),
    )
    chat_template_kwargs: Optional[Dict[str, Any]] = Field(
        default=None,
        description=("Additional kwargs to pass to the template renderer. "
                     "Will be accessible by the chat template."),
    )
DarkLight1337 (Member Author)

These arguments are present in other chat-based APIs so I added them here as well.
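
For illustration (field values are made up; the exact endpoint depends on which request schema these lines belong to), a request could supply the new fields alongside the usual messages:

request_body = {
    "messages": [{"role": "user", "content": "Hello!"}],
    # Inline Jinja template overriding the tokenizer's own chat template.
    "chat_template": (
        "{%- for message in messages -%}"
        "{{ message['role'] }}: {{ message['content'] }}\n"
        "{%- endfor -%}"
    ),
    # Extra variables made available to the template while rendering.
    "chat_template_kwargs": {"add_generation_prompt": True},
}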

@DarkLight1337 DarkLight1337 changed the title [Frontend] Rename and auto-detect --chat-template-text-format [Frontend] Automatic detection of chat template content format using AST parsing Nov 2, 2024
@DarkLight1337 DarkLight1337 changed the title [Frontend] Automatic detection of chat template content format using AST parsing [Frontend] Automatic detection of chat content format from AST Nov 2, 2024
@DarkLight1337 DarkLight1337 force-pushed the chat-template-content-format branch from 8ce013b to e262745 on November 2, 2024 at 16:58
@mergify mergify bot added the ci/build label Nov 2, 2024

mergify bot commented Nov 2, 2024

This pull request has merge conflicts that must be resolved before it can be merged. @DarkLight1337 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork


mergify bot commented Nov 14, 2024

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @DarkLight1337. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 14, 2024
@mergify mergify bot removed the needs-rebase label Nov 14, 2024
DarkLight1337 (Member Author)

@maxdebayser does this look good to you now?

maxdebayser (Contributor) left a comment

@DarkLight1337, I've left a few comments. I think the one about the assignment search is worthy of your consideration, but other than that it looks good to me.

vllm/entrypoints/chat_utils.py: 3 outdated review threads (resolved)

mergify bot commented Nov 15, 2024

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @DarkLight1337. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 15, 2024
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 15, 2024
njhill (Member) commented Nov 15, 2024

@DarkLight1337 looks like there's one test failure remaining

DarkLight1337 (Member Author) commented Nov 15, 2024

The network is quite slow right now (HF keeps timing out for a lot of other PRs). This error comes from not being able to download the video before timeout occurs. (It passes when I run it locally.) Can you approve this PR? Then I'll retry the CI once the network returns to normal.

njhill (Member) left a review

@DarkLight1337 DarkLight1337 merged commit 32e46e0 into main Nov 16, 2024
52 checks passed
@DarkLight1337 DarkLight1337 deleted the chat-template-content-format branch November 16, 2024 05:35
coolkp pushed a commit to coolkp/vllm that referenced this pull request Nov 20, 2024
KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024
mfournioux pushed a commit to mfournioux/vllm that referenced this pull request Nov 20, 2024
rickyyx pushed a commit to rickyyx/vllm that referenced this pull request Nov 20, 2024
tlrmchlsmth pushed a commit to neuralmagic/vllm that referenced this pull request Nov 23, 2024
prashantgupta24 pushed a commit to opendatahub-io/vllm that referenced this pull request Dec 3, 2024
sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024
Labels: ci/build, documentation, frontend, ready
Development

Successfully merging this pull request may close these issues:

  • [Bug]: vllm serve works incorrect for (some) Vision LM models

5 participants